Live freelance tracking. Raw descriptions turned into structured data. Find your next tech project without the noise.
freelancer.com 🟡 2026-05-10
🔹 Regularly extract text content from multiple websites into a structured format.
👤 Client: 🇬🇧 Paris, United Kingdom Member since 2026-05-10
💰 Price: $21 / hr (average bid)
🚩 Problem: Need to automate the process of extracting website text while maintaining structure and ensuring compliance with scraping rules.
📦 Existing: Not specified
Specifications:
[Target] - Websites: URLs for specific pages to be extracted.
[Method] - Python, Scrapy, BeautifulSoup, Selenium (as needed).
[UI/UX] - Not applicable.
[Stack] - Python, Scrapy, BeautifulSoup, Selenium, Pandas, Openpyxl.
[Security] - Respect robots.txt, honor rate limits, and ensure data privacy.
[Format] - Clean Excel workbook with consistent formatting.
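The [Security] requirement above could be approached roughly as follows. This is a minimal sketch, not the client's method: it uses the standard library's `urllib.robotparser` to check paths against a robots.txt body and a small throttle to space out requests. The names `build_parser`, `allowed_urls`, `Throttle`, the `example-bot` user agent, and the sample robots.txt are all hypothetical, chosen only for illustration.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real run would download it from the target site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse a robots.txt body that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, delay_seconds: float) -> None:
        self.delay_seconds = delay_seconds
        self._last_call = None

    def wait(self) -> None:
        # Sleep only for the part of the delay that has not already elapsed.
        if self._last_call is not None:
            remaining = self.delay_seconds - (time.monotonic() - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


parser = build_parser(SAMPLE_ROBOTS)
candidates = [
    "https://example.com/blog/post",
    "https://example.com/private/page",
]
to_fetch = allowed_urls(parser, "example-bot", candidates)

# Fall back to an assumed 1-second delay when robots.txt declares none.
delay = parser.crawl_delay("example-bot") or 1
throttle = Throttle(delay)
```

In a scraper loop, `throttle.wait()` would be called before each request to a URL in `to_fetch`. If Scrapy is used, its `ROBOTSTXT_OBEY` and `DOWNLOAD_DELAY` settings cover the same ground.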
Workflow:
1. Confirm URLs and specific page sections to target.
2. Develop a scraping script using Python/Scrapy/BeautifulSoup/Selenium as needed.
3. Extract text while preserving structure (titles, sub-headings, body copy, meta information).
4. Clean data by stripping HTML tags, preserving line breaks, and applying agreed-upon encoding rules.
5. Export to a clean Excel workbook using Openpyxl for consistent formatting.
6. Provide an initial sample run for approval.
7. Schedule daily or weekly runs based on agreement.
8. Monitor and log any anomalies or access issues during the month.
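Steps 3 to 5 of the workflow above can be sketched as follows, assuming BeautifulSoup for parsing and Openpyxl for the export, as the listing's stack suggests. The sample HTML, the `url / element / text` column layout, and the function names `clean_text`, `extract_rows`, and `write_workbook` are hypothetical. The actual sections to extract and the formatting rules would be agreed with the client in step 1.

```python
from bs4 import BeautifulSoup
from openpyxl import Workbook
from openpyxl.styles import Alignment, Font

# Hypothetical page used only for illustration.
SAMPLE_HTML = """
<html><head><title>Sample Page</title>
<meta name="description" content="A short demo page."></head>
<body>
<h1>Main Title</h1>
<p>First paragraph<br>with a line break.</p>
<h2>Sub-heading</h2>
<p>Second   paragraph with <b>inline</b> markup.</p>
</body></html>
"""


def clean_text(tag) -> str:
    """Strip tags, keep <br> as line breaks, collapse other whitespace."""
    for br in tag.find_all("br"):
        br.replace_with("\n")
    lines = [" ".join(line.split()) for line in tag.get_text().split("\n")]
    return "\n".join(line for line in lines if line)


def extract_rows(html: str, url: str) -> list[tuple[str, str, str]]:
    """Return (url, element, text) rows in document order."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    if soup.title and soup.title.string:
        rows.append((url, "title", soup.title.string.strip()))
    meta = soup.find("meta", attrs={"name": "description"})
    if meta and meta.get("content"):
        rows.append((url, "meta_description", meta["content"].strip()))
    for tag in soup.find_all(["h1", "h2", "h3", "p"]):
        text = clean_text(tag)
        if text:
            rows.append((url, tag.name, text))
    return rows


def write_workbook(rows, path: str) -> None:
    """Write rows to an .xlsx file with a bold header and wrapped text."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Extracted"
    ws.append(["url", "element", "text"])
    for cell in ws[1]:
        cell.font = Font(bold=True)
    for row in rows:
        ws.append(list(row))
    # Wrap the text column so preserved line breaks stay visible in Excel.
    for row in ws.iter_rows(min_row=2, min_col=3, max_col=3):
        for cell in row:
            cell.alignment = Alignment(wrap_text=True, vertical="top")
    ws.column_dimensions["A"].width = 40
    ws.column_dimensions["B"].width = 18
    ws.column_dimensions["C"].width = 80
    ws.freeze_panes = "A2"
    wb.save(path)


rows = extract_rows(SAMPLE_HTML, "https://example.com/sample")
```

Calling `write_workbook(rows, "extracted.xlsx")` would then produce the workbook for the sample run in step 6. Storing one row per element keeps headings and body copy in reading order, so the page structure can be rebuilt from the sheet.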